Combining acoustic, lexical, and syntactic evidence for automatic unsupervised prosody labeling
Abstract
Automatic labeling of prosodic events in speech has potentially significant implications for spoken language processing applications, and has received much attention over the years, especially after the introduction of annotation standards such as ToBI. Current labeling techniques are based on supervised learning, relying on the availability of a corpus that is annotated with the prosodic labels of interest in order to train the system. However, creating such resources is an expensive and time-consuming task. In this paper, we examine an unsupervised labeling algorithm for accent (prominence) and prosodic phrase boundary detection at the linguistic syllable level, and evaluate its performance on a standard, manually annotated corpus. We obtain labeling accuracies of 77.8% and 88.5% for the accent and boundary labeling tasks, respectively. These figures compare well against previously reported performance levels for supervised labelers.
Related resources
Automatic Prosody Labeling Final Project Report for EE 6820 - Spring 05 Professor : Dan
Automatic transcription of prosody is necessary for spoken language understanding. Prominence and intonational boundaries are routinely used to convey meaning beyond that expressed in the lexical content of speech. Using a classification rule learning algorithm and computationally light acoustic and syntactic features, detection of pitch accent at 87% on spontaneous elicited speech were attained...
Unsupervised Syntactic Chunking with Acoustic Cues: Computational Models for Prosodic Bootstrapping
Learning to group words into phrases without supervision is a hard task for NLP systems, but infants routinely accomplish it. We hypothesize that infants use acoustic cues to prosody, which NLP systems typically ignore. To evaluate the utility of prosodic information for phrase discovery, we present an HMM-based unsupervised chunker that learns from only transcribed words and raw acoustic correl...
Exploiting prosodic features for dialog act tagging in a discriminative modeling framework
Cue-based automatic dialog act tagging uses lexical, syntactic and prosodic knowledge in the identification of dialog acts. In this paper, we propose a discriminative framework for automatic dialog act tagging using maximum entropy modeling. We propose two schemes for integrating prosody in our modeling framework: (i) Syntax-based categorical prosody prediction from an automatic prosody labeler,...
Combining lexical, syntactic and prosodic cues for improved online dialog act tagging
Prosody is an important cue for identifying dialog acts. In this paper, we show that modeling the sequence of acoustic-prosodic values as n-gram features with a maximum entropy model for dialog act (DA) tagging can perform better than conventional approaches that use coarse representation of the prosodic contour through summative statistics of the prosodic contour. The proposed scheme for expl...
Exploiting Acoustic and Syntactic Features for Prosody Labeling in a Maximum Entropy Framework
In this paper we describe an automatic prosody labeling framework that exploits both language and speech information. We model the syntactic-prosodic information with a maximum entropy model that achieves an accuracy of 85.2% and 91.5% for pitch accent and boundary tone labeling on the Boston University Radio News corpus. We model the acoustic-prosodic stream with two different models, one a max...
Publication date: 2006